Welcome to pandas!

4.13 Merging and Aligning Series Data

1. Joining within a Series (joining the items of each element)

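When each element of a Series is itself an iterable (such as a list, an array, or a Series), its items can be joined into a single string with s.str.join(). The signature is: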

s.str.join(sep)

sep: the separator placed between the joined items


import numpy as np
import pandas as pd

s = pd.Series([
    [98, "100", "85"], [63, "75"], ["96", "41", 9, "102"], ["ss", "dd"],
    pd.Series(["3", "3433", "343"])])  # an element holding non-string items would come back as NaN

# a = s.map(lambda l: pd.Series(l, dtype="str"))
a = s.map(lambda l: np.array(l, dtype="str"))  # equivalent to the line above
t = a.str.join("-")
print(t)

Output:

0 98-100-85
1 63-75
2 96-41-9-102
3 ss-dd
4 3-3433-343
dtype: object
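The NaN behavior mentioned in the code comment above can be seen directly by skipping the conversion step. The following is a minimal sketch; the sample data and the names mixed and joined are illustrative assumptions, not part of the original example.

```python
import pandas as pd

# Hypothetical sample: the first element mixes an int with strings.
mixed = pd.Series([[98, "100", "85"], ["ss", "dd"]])

# str.join cannot join the int 98 with strings, so that element becomes NaN,
# while the all-string element is joined normally.
joined = mixed.str.join("-")
print(joined)
```

This is why the example converts every element to strings (with np.array(..., dtype="str")) before calling s.str.join().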


2. Concatenating Series data aligned by position

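Besides joining the items inside each of its own elements, a Series can be concatenated with other objects using s.str.cat(). Every element of the Series and of the objects being concatenated must be a string. The signature of s.str.cat() is: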

s.str.cat(others=None, sep=None, na_rep=None, join="left")

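others: the object to concatenate with the Series; a list, array, Series, or DataFrame

sep: the separator used when concatenating; defaults to an empty string

na_rep: the string substituted for missing values. If it is not given and others is supplied, any row that contains a missing value produces a missing value in the result

join: the join style used for alignment; one of left, right, outer, inner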
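The effect of na_rep is easier to see on a small case. The sketch below uses made-up data (the names s, others, without_rep, with_rep, and collapsed are assumptions for illustration) and compares the result with and without a replacement string.

```python
import pandas as pd

s = pd.Series(["a", None, "c"])   # hypothetical data with one missing value
others = ["x", "y", "z"]

# Without na_rep, the row that contains a missing value stays missing.
without_rep = s.str.cat(others, sep="-")

# With na_rep, the missing value is replaced before concatenation.
with_rep = s.str.cat(others, sep="-", na_rep="?")

# With no others, missing values are simply left out of the single string.
collapsed = s.str.cat(sep="-")

print(without_rep)
print(with_rep)
print(collapsed)
```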


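import pandas as pd

s = pd.Series(["a", "b", "c"])
t1 = "-".join(s)          # Python's built-in str.join over the Series values
t2 = s.str.cat(sep="-")   # with no others, collapses the Series into one string
t3 = s.str.join("-")      # joins the characters inside each element, so single characters are unchanged
t4 = s.str.cat()          # same as t2, with an empty separator
print(t1)
print(t2)
print(t3)
print(t4)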

Output:

a-b-c
a-b-c
0 a
1 b
2 c
dtype: object
abc


import pandas as pd

s1 = pd.Series(["a", "b", "c"])
s2 = pd.Series([["a", "b", "c"], ["1", "2", "3"]])
t1 = s1.str.join("-")
t2 = s2.str.join("-")  # each element must be an iterable (here, a list) for its items to be joined
t3 = s2.str.join("-").str.cat(sep="@")  # join within each element, then concatenate the results
print(t1)
print(t2)
print(t3)

Output:

0 a
1 b
2 c
dtype: object

0 a-b-c
1 1-2-3
dtype: object

a-b-c@1-2-3


import pandas as pd

s = pd.Series(
    data=["张三", "李四", "王五"],
    index=["NDE01", "EDN04", "EDN05"])
l = ["39", "40", "45"]
t = s.str.cat(l, "-")
print(t)

Output:

NDE01 张三-39
EDN04 李四-40
EDN05 王五-45
dtype: object
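The list in this example already stores the ages as strings. Because s.str.cat() requires text on both sides, numeric values have to be converted first; passing integers directly is expected to fail with a TypeError. The sketch below shows one way to do the conversion. The variables names, ages, and age_text are hypothetical and introduced only for illustration.

```python
import pandas as pd

names = pd.Series(
    data=["张三", "李四", "王五"],
    index=["NDE01", "EDN04", "EDN05"])

ages = [39, 40, 45]  # hypothetical numeric data, not strings

# Convert each number to text so that str.cat accepts the list.
age_text = [str(age) for age in ages]

# A plain list has no index, so it is matched to the Series by position.
t = names.str.cat(age_text, sep="-")
print(t)
```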


import numpy as np
import pandas as pd

arr = np.array([
    ["28", "张三", "财务部"],
    ["34", "李四", "销售部"],
    ["56", "王五", "开发部"]
])
s = pd.Series(
    data=["张三", "李四", "王五"],
    index=["NDE01", "EDN04", "EDN05"])
t = s.str.cat(arr, "-")  # each row of the 2-D array is appended to the element at the same position
print(t)

Output:

NDE01 张三-28-张三-财务部
EDN04 李四-34-李四-销售部
EDN05 王五-56-王五-开发部
dtype: object


3. Concatenating Series data aligned by index

import numpy as np
import pandas as pd

s = pd.Series(
    data=["张三", "李四", "王五"],
    index=["END01", "END04", "END03"])
arr = np.array([
    ["28", "张三", "财务部"],
    ["34", "李四", "销售部"],
    ["56", "王五", "开发部"]
])
df = pd.DataFrame(
    data=arr,
    index=["END09", "END03", "END01"],
    columns=["年龄", "性别", "部门"]
)
# The DataFrame carries an index, so rows are matched to s by label; join picks which labels are kept.
t1 = s.str.cat(df, sep="-", na_rep="None", join="left")
t2 = s.str.cat(df, sep="-", na_rep="None", join="right")
t3 = s.str.cat(df, sep="-", na_rep="None", join="outer")
t4 = s.str.cat(df, sep="-", na_rep="None", join="inner")
print(t1)
print(t2)
print(t3)
print(t4)

Output:

END01 张三-56-王五-开发部
END04 李四-None-None-None
END03 王五-34-李四-销售部
dtype: object

END09 None-28-张三-财务部
END03 王五-34-李四-销售部
END01 张三-56-王五-开发部
dtype: object

END01 张三-56-王五-开发部
END03 王五-34-李四-销售部
END04 李四-None-None-None
END09 None-28-张三-财务部
dtype: object

END01 张三-56-王五-开发部
END03 王五-34-李四-销售部
dtype: object
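The difference between index alignment and positional alignment can be shown side by side. In the sketch below, which uses made-up data and the hypothetical names other, by_index, and by_position, the same values are concatenated once as a Series (matched by label) and once as a plain NumPy array obtained with to_numpy() (matched by position, since the array has no index).

```python
import pandas as pd

s = pd.Series(["a", "b", "c"], index=["x", "y", "z"])
other = pd.Series(["1", "2", "3"], index=["z", "y", "x"])  # same labels, reversed order

# A Series carries an index, so values are matched by label.
by_index = s.str.cat(other, sep="-")

# A NumPy array has no index, so values are matched by position.
by_position = s.str.cat(other.to_numpy(), sep="-")

print(by_index)
print(by_position)
```

Stripping the index this way only works when the array has the same length as the calling Series, because there are no labels left to decide which rows to keep.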